1. Introduction


Blumenthal (1976) considered the problem of sequential estimation of the
largest of k normal means when a bound is set on the acceptable mean square
error. He showed that, for the case of known variance, his procedure results
in only a small savings in sample size when compared to a conservative fixed
sample procedure. Carroll (1978) criticized this procedure because it does not
give the user the flexibility of sampling selectively from the k populations.
Carroll (1978) defined a procedure which, early in the experiment, eliminates
from further consideration those populations which are obviously not associated
with the largest mean and hence provide little relevant information; his
theoretical large-sample calculations indicate possible large savings in sample
size with no corresponding increase in mean square error. In this paper we
contrast the small sample behavior of the two approaches by means of a
Monte-Carlo simulation study; both known and unknown variance are considered.

2. Known Variance

We are dealing with independent identically distributed observations
$X_{i1}, X_{i2}, \ldots$ from the ith population, i = 1,2. These are assumed to
be normally distributed with means $\mu_1$ and $\mu_2$ and common variance
$\sigma^2$. The goal is to estimate the larger mean
$\mu^* = \max(\mu_1, \mu_2)$ with a prespecified bound r on the mean square
error (MSE). The asymptotic theorems in Blumenthal (1976) and Carroll (1977)
take place as $r \to 0$. If
$\Delta = \max(\mu_1,\mu_2) - \min(\mu_1,\mu_2) = |\mu_1 - \mu_2|$, the mean
square error for estimating $\mu^*$ by the larger sample mean based on n
observations can be written as

$$\mathrm{MSE} = (\sigma^2/n)\, H(n^{1/2}\Delta/\sigma).$$
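The function H is not written out here. As a minimal sketch: the error of the larger sample mean, rescaled by $n^{1/2}/\sigma$, is distributed as $\max(Z_1, Z_2 - \delta)$ with $Z_1, Z_2$ independent standard normal and $\delta = n^{1/2}\Delta/\sigma$, so H can be taken as the second moment of that quantity. The closed form below is the standard second moment of the maximum of two independent normals; it is an assumption that this coincides with the H used by Blumenthal (1976), and the function names are illustrative only. A Monte-Carlo estimate is included as a check.

```python
import math
import random


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def H_closed(delta):
    """Assumed closed form: E[max(Z1, Z2 - delta)^2] for iid N(0,1) Z1, Z2."""
    a = delta / math.sqrt(2.0)
    return 1.0 + (delta * delta * norm_cdf(-a)
                  - math.sqrt(2.0) * delta * norm_pdf(a))


def H_monte_carlo(delta, reps=100_000, seed=0):
    """Simulation estimate of the same second moment, used only as a check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        err = max(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0) - delta)
        total += err * err
    return total / reps


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(d, round(H_closed(d), 4), round(H_monte_carlo(d), 4))
```

Under this form H equals 1 at $\delta = 0$, dips below 1 for moderate $\delta$, and returns to 1 as $\delta$ grows, so a fixed sample of about $\sigma^2/r$ observations per population is conservative.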
In order to control the MSE at a prespecified level r when $\sigma$ is known,
Blumenthal (1976) defined the following stopping time.

Definition 2.1. After obtaining m observations from each population, estimate
$\Delta$ by $\hat\Delta_m = |\bar X_{1m} - \bar X_{2m}|$, define

$$n(m) = \inf\{n \ge n_0 : (\sigma^2/n)\, H(n^{1/2}\hat\Delta_m/\sigma) \le r\},$$

and define

$$N_B = \inf\{m \ge n_0 : n(m) \le m\}.$$

Because for k=2 populations the risk is a decreasing function of the sample
size, one can show that

$$N_B = \inf\{n \ge n_0 : nr/\sigma^2 \ge H(n^{1/2}\hat\Delta_n/\sigma)\}.$$
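The last form of the rule can be sketched directly as a loop: sample one observation from each population, recompute $\hat\Delta_n$, and stop the first time $nr/\sigma^2 \ge H(n^{1/2}\hat\Delta_n/\sigma)$. The sketch below assumes the closed form for H described in the docstring (the second moment of the maximum of two independent normals, not taken from the text), and the function name, the default `n0=2`, and the `rng` argument are hypothetical choices for illustration.

```python
import math
import random


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def H(delta):
    """Assumed form of H: E[max(Z1, Z2 - delta)^2], Z1, Z2 iid N(0,1)."""
    a = delta / math.sqrt(2.0)
    return 1.0 + (delta * delta * norm_cdf(-a)
                  - math.sqrt(2.0) * delta * norm_pdf(a))


def blumenthal_stopping_time(mu1, mu2, r, n0=2, sigma=1.0, rng=None):
    """Return (N_B, estimate) for one simulated experiment.

    N_B is the per-population sample size; the estimate of the larger
    mean is the larger of the two sample means at stopping.
    """
    rng = rng or random.Random()
    s1 = s2 = 0.0
    n = 0
    while True:
        s1 += rng.gauss(mu1, sigma)
        s2 += rng.gauss(mu2, sigma)
        n += 1
        if n < n0:
            continue
        xbar1, xbar2 = s1 / n, s2 / n
        delta_hat = abs(xbar1 - xbar2)
        # Stop once the estimated MSE bound (sigma^2/n) * H(...) is <= r.
        if n * r / sigma ** 2 >= H(math.sqrt(n) * delta_hat / sigma):
            return n, max(xbar1, xbar2)


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [blumenthal_stopping_time(2.0, 0.0, r=0.05, rng=rng) for _ in range(500)]
    print("average total sample size:", 2 * sum(n for n, _ in runs) / len(runs))
```

Because the assumed H never exceeds 1, the loop always stops by the time n reaches about $\sigma^2/r$, which matches the remark that the savings over a conservative fixed sample are small.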

Carroll (1978) has shown that Blumenthal's procedure $N_B$ is inefficient in
that it does not make use of all the information available in the data. In
particular, it does not recognize cases when one population is obviously
associated with the smaller mean. Carroll (1978) defined a procedure which
attempts to recognize this situation and stop sampling (early in the
experiment) from populations which provide little information about $\mu^*$.
The idea is based on a technique of Swanepoel and Geertsema (1976) and can be
described fully as follows. We take $\sigma^2 = 1$ throughout.

Step #1. Choose a small value $\alpha$, which is the probability of falsely
eliminating the population associated with the larger mean. Letting $\Phi$
($\phi$) be the standard normal distribution (density) function, define
$b = b(\alpha)$ by

$$1 - \Phi(b) + b\,\phi(b) + \phi^2(b)/\Phi(b) = \alpha.$$
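The equation for $b(\alpha)$ has no closed-form solution, but its left side is decreasing in b for b > 0 (from about .82 at b = 0 down to 0), so a bisection search suffices for any practical $\alpha$. The following is a minimal sketch, assuming the equation reads $1 - \Phi(b) + b\phi(b) + \phi^2(b)/\Phi(b) = \alpha$; the function names and the search interval [0, 10] are illustrative choices.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def elimination_error(b):
    """Left side of the defining equation for b (assumed form)."""
    return (1.0 - norm_cdf(b) + b * norm_pdf(b)
            + norm_pdf(b) ** 2 / norm_cdf(b))


def solve_b(alpha, lo=0.0, hi=10.0, tol=1e-12):
    """Bisection for the root of elimination_error(b) = alpha on [lo, hi]."""
    if not (elimination_error(hi) < alpha < elimination_error(lo)):
        raise ValueError("alpha is outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if elimination_error(mid) > alpha:
            lo = mid  # error still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        print(alpha, round(solve_b(alpha), 4))
```

Smaller $\alpha$ gives a larger b and hence a more cautious elimination boundary.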

Step #2. Define a stopping rule

$$N_E = \inf\{n \ge n_0 : \hat\Delta_n \ge 2^{1/2}\,((b^2 + \log n)/n)^{1/2}\}.$$
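The elimination rule can be sketched as a loop over n that compares the absolute difference of the two sample means with a shrinking boundary. The sketch assumes $\sigma^2 = 1$ and a boundary of the form $2^{1/2}((b^2 + \log n)/n)^{1/2}$; the factor $2^{1/2}$ reflects the variance 2/n of the difference of sample means and should be treated as an assumption. The names `boundary`, `elimination_time`, the default `n0`, and the cap `n_max` are hypothetical.

```python
import math
import random


def boundary(b, n):
    """Assumed elimination boundary for sigma^2 = 1."""
    return math.sqrt(2.0 * (b * b + math.log(n)) / n)


def elimination_time(mu1, mu2, b, n0=2, n_max=10_000, rng=None):
    """First n in [n0, n_max] with |xbar1 - xbar2| >= boundary(b, n).

    Returns None if the boundary is not crossed by n_max, i.e. no
    population is eliminated within the allowed number of observations.
    """
    rng = rng or random.Random()
    s1 = s2 = 0.0
    for n in range(1, n_max + 1):
        s1 += rng.gauss(mu1, 1.0)
        s2 += rng.gauss(mu2, 1.0)
        if n >= n0 and abs(s1 - s2) / n >= boundary(b, n):
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # b = 2.5 is roughly the value associated with alpha = .05 (an assumption).
    times = [elimination_time(2.0, 0.0, b=2.5, n0=3, rng=rng) for _ in range(500)]
    print("average elimination time:", sum(times) / len(times))
```

When the means are far apart the boundary is crossed almost immediately, which is the situation in which elimination saves the most observations.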
Step #3. Define the stopping time $N(\alpha)$ as follows. For a given r, if
$[\cdot]$ is the greatest integer function, we will take $N_B$ observations
from each population if $N_B \le N_E$ (no elimination necessary). If
$N_E < N_B$ (elimination necessary), we take

    $N_E$ observations from the population with smaller mean
    $[1/r]+2$ observations from the population with larger mean.

The total sample size is $N(\alpha)$. Note that $N(0.0) = 2N_B$, so
Blumenthal's procedure can be read off from the case $\alpha = 0.0$. We chose
$n_0 = [\min_x H(x)/r] - 1$.

In order to investigate the small sample performance of $N(\alpha)$, we
conducted a Monte-Carlo experiment with 500 iterations and various choices of
$\alpha$, r and $\Delta$. In Tables 1-4 we record the following information.

(1) Average value of $N(\alpha)$.

(2) Average value of $N(\alpha)$ times r.

(3) Bias.

(4) Mean square error divided by r. This should be no more than 1 if we
are to meet our goal of controlling MSE by the bound r.

The conclusion one can draw from the information in Tables 1-4 is clear:
using elimination results in smaller (sometimes much smaller) sample sizes with
no real increase in bias or mean square error. For example, with
$\Delta = 2.00$ and r = .01, the average sample size falls from 202.0 for
$\alpha = .00$ to 107.3 for $\alpha = .05$, while the mean square error divided
by r changes only from 1.01 to 1.02.


3. Unknown Variance


For the case that the variance is unknown, the stopping time $N_B$ changes
only in that $\sigma^2$ is now estimated by

$$s_n^2 = [\ldots]^{-1} \sum_{i=1}^{n} (X_{1i} - X_{2i} - \bar X_{1n} + \bar X_{2n})^2 .$$
The stopping time $N_E$ is again suggested by Swanepoel and Geertsema (1976).
For a given $\alpha$, we are going to take $n_0 \ge 5$. Define

$$t = .2\,(1 + a^2/4)^5,$$
$$1 - F_4(a) + a f_4(a) = \alpha,$$

where $F_4$ ($f_4$) is the distribution (density) function of a t distribution
with four degrees of freedom. Define

$$h(a,n) = [(tn)^{1/n} - 1]^{1/2}.$$

Then

$$N_E = \inf\{n \ge n_0 : |\bar X_{1n} - \bar X_{2n}| > h(a,n)\, s_n\}.$$
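The constants a, t and h(a,n) can be computed without statistical libraries, because the t distribution with four degrees of freedom has an elementary distribution function. The sketch below assumes the defining relations read $1 - F_4(a) + a f_4(a) = \alpha$, $t = .2(1 + a^2/4)^5$ and $h(a,n) = [(tn)^{1/n} - 1]^{1/2}$; the function names and the bisection interval are illustrative. The estimate $s_n$ is not computed here and would be supplied by the caller.

```python
import math


def t4_pdf(x):
    """Density of the t distribution with 4 degrees of freedom."""
    return 0.375 * (1.0 + x * x / 4.0) ** -2.5


def t4_cdf(x):
    """Distribution function of the t distribution with 4 degrees of freedom."""
    u = x / math.sqrt(1.0 + x * x / 4.0)
    return 0.5 + 0.375 * u * (1.0 - u * u / 12.0)


def solve_a(alpha, lo=0.0, hi=1000.0, tol=1e-12):
    """Bisection for 1 - F_4(a) + a f_4(a) = alpha (left side decreases in a)."""
    def g(a):
        return 1.0 - t4_cdf(a) + a * t4_pdf(a)
    if not (g(hi) < alpha < g(lo)):
        raise ValueError("alpha is outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def h(a, n):
    """Boundary multiplier h(a, n) = [(t n)^(1/n) - 1]^(1/2), t = .2 (1 + a^2/4)^5."""
    t = 0.2 * (1.0 + a * a / 4.0) ** 5
    return math.sqrt((t * n) ** (1.0 / n) - 1.0)


if __name__ == "__main__":
    a = solve_a(0.05)
    print("a =", round(a, 4))
    for n in (5, 10, 50, 100):
        print(n, round(h(a, n), 4))
```

With this choice of t, h(a,5) reduces to a/2, so at the smallest allowed sample size n = 5 the rule compares the difference of sample means with a fixed multiple of $s_5$.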

The results of a Monte-Carlo experiment for this stopping time are given
in Tables 5-8.

The conclusion is the same as the case of variance known. Using elimination
decreases sample size without materially changing bias or mean square error.
Table 1

Average sample size when the variance is known.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10           18.5            19.5            19.0
                 r = .05           28.1            35.4            37.2
                 r = .02           57.7            70.2            91.4
                 r = .01          107.3           120.4           183.8

$\alpha=.01$     r = .10           18.4            19.7            19.0
                 r = .05           28.9            37.4            37.4
                 r = .02           58.8            76.6            92.7
                 r = .01          108.7           127.3           188.4

$\alpha=.00$     r = .10           21.1            19.9            19.0
                 r = .05           41.9            40.2            37.5
                 r = .02          102.0*          101.4            93.0
                 r = .01          202.0*          202.0*          189.35

* denotes maximum possible sample size.
Table 2

Average sample size times r when the variance is known.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10           1.85            1.95            1.90
                 r = .05           1.41            1.77            1.86
                 r = .02           1.15            1.40            1.83
                 r = .01           1.07            1.20            1.84

$\alpha=.01$     r = .10           1.84            1.97            1.90
                 r = .05           1.45            1.87            1.87
                 r = .02           1.18            1.53            1.85
                 r = .01           1.09            1.27            1.88

$\alpha=.00$     r = .10           2.11            1.99            1.90
                 r = .05           2.10            2.01            1.87
                 r = .02           2.04            2.03            1.86
                 r = .01           2.02            2.02            1.89
Table 3

Bias $\times 10^2$ when the variance is known.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10            1.5              .6            10.6
                 r = .05             .6              .4             5.0
                 r = .02            -.3             -.3             1.1
                 r = .01            -.1             -.5             -.3

$\alpha=.01$     r = .10             .9             1.1            10.7
                 r = .05             .6              .3             5.0
                 r = .02            -.3             -.3             1.1
                 r = .01            -.4             -.5             -.2

$\alpha=.00$     r = .10             .8             1.5            10.7
                 r = .05             .5              .6             5.1
                 r = .02            -.3             -.3             1.1
Table 4

Mean square error divided by r when the variance is known.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10            .89             .96             .86
                 r = .05            .88             .92             .76
                 r = .02            .92             .92             .85
                 r = .01           1.02            1.02             .95

$\alpha=.01$     r = .10            .91             .99             .87
                 r = .05            .88             .93             .78
                 r = .02            .92             .93             .84
                 r = .01           1.02            1.02             .94

$\alpha=.00$     r = .10            .96            1.02             .87
                 r = .05            .93             .97             .78
                 r = .02            .93             .93             .85
                 r = .01           1.01            1.01             .94
Table 5

Average sample size when the variance is unknown.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10           23.7            25.0            22.8
                 r = .05           34.5            50.4            47.9
                 r = .02           64.4            89.5           126.7
                 r = .01          114.4           138.3           261.8

$\alpha=.01$     r = .10           25.4            25.0            22.7
                 r = .05           41.1            53.4            48.0
                 r = .02           70.4           109.7           127.3
                 r = .01          120.4           156.5           263.8

$\alpha=.00$     r = .10           25.4            25.0            22.7
                 r = .05           53.9            53.8            47.9
                 r = .02           97.8           139.6           127.3
                 r = .01          146.7           241.5           263.8

* indicates maximum possible sample size obtained.


r 


Average 


a = .05 


a = .01 


a = .00 
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Table  6 

sample  size  times  r when  the  variance  is  unknown. 


A=2 . 00 

A=1 . 00 

A= . 20 

r = 

.10 

2.37 

2.50 

2.28 

r = 

.05 

1.72 

2.52 

2.40 

r = 

.02 

1.29 

1.79 

2.53 

r = 

.01 

1.14 

1.38 

2.62 

r = .10 

2.54 

2.50 

2.27 

r = .05 

2.06 

2.67 

2.40 

r = .02 

1.41 

2.19 

2.55 

r = .01 

1.20 

1.56 

2.64 

r = .10 

2.54 

2.50 

2.27 

r = .05 

2.69 

2.69 

2.40 

r = .02 

1.96 

2.79 

2.55 

r = .01 

1.47 

2.42 

2.64 

I 


, 12 

| 

I 

| 

Table 7

Bias $\times 10^2$ when the variance is unknown.
Table 8

Mean square error divided by r when the variance is unknown.

                               $\Delta=2.00$   $\Delta=1.00$   $\Delta=.20$
$\alpha=.05$     r = .10            .87             .87             .71
                 r = .05            .81             .74             .55
                 r = .02            .96             .88             .62
                 r = .01            .93             .92             .64

$\alpha=.01$     r = .10            .86             .89             .72
                 r = .05            .78             .70             .55
                 r = .02            .96             .81             .62
                 r = .01            .93             .90             .63

$\alpha=.00$     r = .10            .86             .89             .73
                 r = .05            .72             .72             .58
                 r = .02            .92             .66             .62
                 r = .01            .93             .77             .63


